ABSTRACT

Consider the p-dimensional unit cube $[0,1]^p$, $p \ge 1$. Partition $[0,1]^p$ into $n$ regions $R_{1,n}, \ldots, R_{n,n}$ such that the volume $A(R_{j,n})$ is of order $n^{-1}$, $j = 1, \ldots, n$. Select and fix a point in each of these regions so that we have $x_1^{(n)}, \ldots, x_n^{(n)}$. Suppose that associated with the j-th predictor vector $x_j^{(n)}$ there is an observable variable $Y_j^{(n)}$, $j = 1, \ldots, n$, satisfying the multiple regression model $Y_j^{(n)} = g(x_j^{(n)}) + e_j^{(n)}$, where $g$ is an unknown function defined on $[0,1]^p$ and $\{e_j^{(n)}\}$ are independent identically distributed random variables with $E e_1^{(n)} = 0$ and $\mathrm{Var}\, e_1^{(n)} = \sigma^2 < \infty$. This paper proposes

$$g_n(x) = a_n^{-p} \sum_{j=1}^{n} Y_j^{(n)} \int_{R_{j,n}} k[(x-u)/a_n]\,du$$

as an estimator of $g(x)$, where $k(u)$ is a known p-dimensional bounded density and $\{a_n\}$ is a sequence of reals converging to 0 as $n \to \infty$. Weak and strong consistency of $g_n(x)$ and rates of convergence are obtained. Asymptotic normality of the estimator is established. Also proposed is $\sigma_n^2 = n^{-1} \sum_{j=1}^{n} (Y_j^{(n)} - g_n(x_j^{(n)}))^2$ as a consistent estimate of $\sigma^2$.


1.  INTRODUCTION.


A statistical problem which finds a wide range of applications is the estimation of a regression function, $g(x) = E(Y \mid x)$, where Y is a dependent variable and x is a $p \times 1$ vector of regressors ($p \ge 1$). If g(x) is specified except for a set of parameters, then a typical estimate for g(x) would be the least squares estimate which is, of course, the maximum likelihood estimate if the errors are normally distributed and g(x) is linear. But if g(x) is completely unknown, it would be desirable to estimate g(x) by a method that would provide good properties of the estimate.

When x is univariate (i.e., p = 1), such a method is proposed and studied by Priestley and Chao (1972). Their estimate has been further studied by Benedetti (1977), Cheng and Lin (1981a, b), and Schuster and Yakowitz (1979), among others. The Priestley-Chao estimate is nonparametric in the sense that the conditional distribution of Y given x is not specified. This estimate resembles the kernel estimate of a probability density function investigated by Rosenblatt (1956), Parzen (1962), and many others.

In the present investigation, an estimate of g(x) is proposed when there are at least two independent regressors. This is not a direct generalization of the Priestley-Chao estimate. Also presented is a consistent estimate for the error variance. With the aid of the variance estimate, an asymptotic confidence interval for g(x) can be constructed.


The multiple regression function model we discuss here may be presented as follows: Let $[0,1]^p$ denote the p-dimensional unit cube ($p \ge 1$). Divide the unit cube into n mutually disjoint and totally exhaustive regions $R_{1,n}, \ldots, R_{n,n}$ such that the volume of $R_{j,n}$ converges to 0 as $n \to \infty$. From each of these regions select and fix a point so that we have $x_1^{(n)}, \ldots, x_n^{(n)}$, where $x_j^{(n)} \in R_{j,n}$, $j = 1, \ldots, n$. Suppose that $Y_1^{(n)}, \ldots, Y_n^{(n)}$ is a random sample obtained from the following model:

$$Y_j^{(n)} = g(x_j^{(n)}) + e_j^{(n)}, \quad j = 1, \ldots, n, \tag{1.1}$$

where $e_1^{(n)}, \ldots, e_n^{(n)}$ are independent identically distributed (iid) random variables such that $E e_1^{(n)} = 0$ and $\mathrm{Var}\, e_1^{(n)} = \sigma^2 < \infty$, and $g(\cdot)$ is an unknown p-dimensional function defined on $[0,1]^p$. The problem is to estimate g(x). Let $A(R_{j,n})$ denote the volume of the j-th region $R_{j,n}$, $j = 1, \ldots, n$. A nonparametric estimate of g(x) is defined by:

$$g_n(x) = a_n^{-p} \sum_{j=1}^{n} Y_j^{(n)} \int_{R_{j,n}} k[(x-u)/a_n]\,du, \tag{1.2}$$

where k(u) is a known p-dimensional probability density function satisfying the following conditions:

(i) $\sup_u k(u) < \infty$, (ii) $\lim_{\|u\| \to \infty} \|u\|\, k(u) = 0$,

where $\|\cdot\|$ is the Euclidean norm.

If an approximate confidence interval for g(x) is desired, then one would need a consistent estimate of $\sigma^2$. One such estimate may be given by:

$$\sigma_n^2 = n^{-1} \sum_{j=1}^{n} \big(Y_j^{(n)} - g_n(x_j^{(n)})\big)^2. \tag{1.3}$$
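As an illustration of the definitions (1.2) and (1.3), the following is a minimal sketch for the special case p = 1. It assumes a uniform partition $R_j = [(j-1)/n, j/n]$, design points at the cell midpoints, and the kernel $k(u) = \tfrac{3}{4}(1-u^2)$ on $[-1,1]$, for which the cell integrals have a closed form. The function names (`kernel_cdf`, `cell_weights`, `g_n`, `sigma2_n`) and the simulated data are hypothetical choices made for this example only.

```python
import numpy as np

def kernel_cdf(v):
    """CDF of the kernel k(u) = 0.75 * (1 - u**2) supported on [-1, 1]."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3

def cell_weights(x, n, a):
    """a^{-1} * integral over R_j of k((x - u)/a) du for R_j = [(j-1)/n, j/n]."""
    edges = np.linspace(0.0, 1.0, n + 1)
    # Substituting v = (x - u)/a turns each cell integral into a CDF difference.
    return kernel_cdf((x - edges[:-1]) / a) - kernel_cdf((x - edges[1:]) / a)

def g_n(x, y, a):
    """Estimate (1.2) at the point x from responses y_1, ..., y_n (p = 1)."""
    return float(cell_weights(x, len(y), a) @ y)

def sigma2_n(y, a):
    """Estimate (1.3), with the design points assumed to be the cell midpoints."""
    n = len(y)
    design = (np.arange(n) + 0.5) / n
    fitted = np.array([g_n(xj, y, a) for xj in design])
    return float(np.mean((y - fitted) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, a = 400, 0.1
    design = (np.arange(n) + 0.5) / n
    y = np.sin(2 * np.pi * design) + 0.2 * rng.standard_normal(n)
    print(g_n(0.25, y, a), sigma2_n(y, a))
```

For an interior point x (at least $a_n$ away from 0 and 1) the weights sum to one, so a constant response is reproduced exactly; near the boundary the weights sum to less than one.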


The organization of this paper is as follows: In Section 2, weak and strong consistency and their rates, and asymptotic normality of $g_n(x)$ are established. The conditions required to prove the consistency of $g_n(x)$ are much weaker than those of Priestley and Chao (1972), Schuster and Yakowitz (1979), and Benedetti (1977); and the methods of proof are different. In Section 3, the (weak) consistency of $\sigma_n^2$ is established. Thus an approximate normal confidence interval for g(x) can be established. Finally, in Section 4 we discuss the optimal choice of k(u).

In the rest of this paper we employ the notations $\xrightarrow{P}$, $\xrightarrow{wp1}$, and $\xrightarrow{D}$ to mean, respectively, convergence in probability, with probability one, and in distribution. Unless otherwise specified, hereafter, all integral signs will mean multiple integration. For ease of exposition, we shall write $R_j$ for $R_{j,n}$ and suppress all superscripts for $Y_j^{(n)}$, $x_j^{(n)}$, and $e_j^{(n)}$, $j = 1, \ldots, n$, in the remainder of the paper.

2.  PROPERTIES OF $g_n(x)$.

In this section some basic properties of $g_n(x)$ are established. Precisely, we show that $g_n(x)$ is asymptotically unbiased (Theorem 1), weakly consistent (Theorem 2), strongly consistent (Theorem 3), and asymptotically normal (Theorem 6). We also demonstrate that the rate of weak consistency is of the order $O(n^{-\rho})$ for some $\rho > 0$.

THEOREM 1. If $\max_{1 \le j \le n} A(R_j) = O(n^{-1})$, if $n a_n^p \to \infty$ as $n \to \infty$, and if g(x) is continuous on $[0,1]^p$, then for each $x \in [0,1]^p$,

$$E g_n(x) \to g(x) \quad \text{as } n \to \infty. \tag{2.1}$$


PROOF. Note that

$$E g_n(x) = a_n^{-p} \sum_{j=1}^{n} g(x_j) \int_{R_j} k[(x-u)/a_n]\,du,$$

where $du = du_1 \cdots du_p$. Thus,

$$|E g_n(x) - g(x)| \le \Big| \sum_{j=1}^{n} \int_{R_j} [g(x_j) - g(u)]\, a_n^{-p} k[(x-u)/a_n]\,du \Big| + \Big| \sum_{j=1}^{n} \int_{R_j} [g(u) - g(x)]\, a_n^{-p} k[(x-u)/a_n]\,du \Big| = I_{1n} + I_{2n}, \text{ say.} \tag{2.2}$$

But since g(x) is uniformly continuous on $[0,1]^p$ and $\max_{1 \le j \le n} A(R_j) = O(n^{-1})$, then for sufficiently large n, $I_{1n}$ can be made arbitrarily small. Note also that as $n \to \infty$

$$I_{2n} = \Big| \int_{[0,1]^p} [g(u) - g(x)]\, a_n^{-p} k[(x-u)/a_n]\,du \Big| \to 0, \tag{2.3}$$

by Lemma 2.1 of Cacoullos (1966) provided $n a_n^p \to \infty$ as $n \to \infty$. ||

THEOREM 2. If the conditions of Theorem 1 hold then for all $x \in [0,1]^p$,

$$g_n(x) \xrightarrow{P} g(x) \quad \text{as } n \to \infty. \tag{2.4}$$

PROOF. In light of Theorem 1 we need only to show that for all $x \in [0,1]^p$, $g_n(x) - E g_n(x) \xrightarrow{P} 0$ as $n \to \infty$. To this end we use the following result of Pruitt (1966, Theorem 1): Let $\{U_j\}$ be a sequence of iid random variables such that $E U_1 = 0$, and let $Z_n = \sum_{j=1}^{n} C_{nj} U_j$, where $\{C_{nj}\}$ is an array of constants such that $\lim_{n \to \infty} C_{nj} = 0$ for every integer j. Then $Z_n \xrightarrow{P} 0$ if and only if $\max_{1 \le j \le n} C_{nj} \to 0$ as $n \to \infty$. Now set

$$U_j = Y_j - E Y_j \quad \text{and} \quad C_{nj} = a_n^{-p} \int_{R_j} k[(x-u)/a_n]\,du. \tag{2.5}$$

Then there exists a positive constant C not depending on n such that

$$\max_{1 \le j \le n} C_{nj} \le a_n^{-p} \max_{1 \le j \le n} A(R_j)\, \sup_v k(v) \le C (n a_n^p)^{-1} \to 0, \quad \text{as } n \to \infty.$$

Thus the result follows. ||
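The bound on $\max_j C_{nj}$ used in the proof above can be checked numerically. The sketch below assumes p = 1, the uniform partition $R_j = [(j-1)/n, j/n]$, the kernel $k(u) = \tfrac{3}{4}(1-u^2)$ on $[-1,1]$ (so $\sup_v k(v) = 0.75$), and the bandwidth $a_n = n^{-1/2}$; all of these, and the helper names, are illustrative assumptions rather than choices made in the paper.

```python
import numpy as np

def kernel_cdf(v):
    """CDF of the kernel k(u) = 0.75 * (1 - u**2) supported on [-1, 1]."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3

def max_weight(x, n, a):
    """max_j C_nj with C_nj = a^{-1} * integral over R_j of k((x - u)/a) du."""
    edges = np.linspace(0.0, 1.0, n + 1)
    w = kernel_cdf((x - edges[:-1]) / a) - kernel_cdf((x - edges[1:]) / a)
    return float(w.max())

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        a = n ** -0.5             # n * a_n tends to infinity
        bound = 0.75 / (n * a)    # sup k * max A(R_j) / a_n
        print(n, max_weight(0.5, n, a), bound)
```

Both the largest weight and its bound shrink as n grows, which is the condition Pruitt's theorem requires.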

THEOREM 3. Assume that the conditions of Theorem 1 are in force and that $a_n \ge C n^{-(1-\theta)/p}$ for some $C > 0$ and $\theta \in (0,1)$. If $E|e_1|^{1+1/\theta} < \infty$, then for all $x \in [0,1]^p$,

$$g_n(x) \xrightarrow{wp1} g(x), \quad \text{as } n \to \infty. \tag{2.6}$$

PROOF. We use another result of Pruitt (1966, Theorem 2), in which it is stated that if $\max_{1 \le j \le n} C_{nj} = O(n^{-\theta})$ for some $0 < \theta < 1$ and if $E|U_1|^{1+1/\theta} < \infty$, then $Z_n \xrightarrow{wp1} 0$ as $n \to \infty$, where $C_{nj}$, $U_j$ and $Z_n$ are as defined in the proof of our Theorem 2. Thus $\max_{1 \le j \le n} C_{nj} \le C/(n a_n^p) = O(n^{-\theta})$ if we choose $a_n \ge C n^{-(1-\theta)/p}$, and the desired conclusion follows. ||

It  is  possible  to  obtain  the  rates  of  convergence  in  weak 
consistency  (Theorem  2) .  First  we  state  and  prove  the  following 
lemma. 

LEMMA 1. Suppose that there exist positive constants $C_1$ and $C_2$ such that, for all $n \ge 1$,

$$a_n \ge \max\big(C_1 n^{-\alpha/p},\; C_2 n^{-(1-\beta)/p}\big) \tag{2.7}$$

for some $0 < \alpha < \beta < 1$. If $G(u) = P(|Y_1 - E Y_1| \ge u)$, and if $u^t G(u) \le M < \infty$ for some $t > 1 + \alpha/\beta$, then for every $\varepsilon > 0$,

$$P(|g_n(x) - E g_n(x)| > \varepsilon) = O(n^{-\rho}), \tag{2.8}$$

where $\rho = \beta(t-1) - \alpha > 0$.


PROOF. We make use of Theorem 1 of Franck and Hanson (1966). Recall the definitions of $\{C_{nj}\}$, $\{U_j\}$ and $Z_n$ from the proof of Theorem 2. Franck and Hanson (1966) show that if for some constants $0 < \alpha < \beta$, $\sum_{j=1}^{n} C_{nj} \le C_3 n^{\alpha}$, $\max_{1 \le j \le n} C_{nj} \le C_4 n^{-\beta}$, and if for some $t > 1$, $\sum_{j=1}^{n} C_{nj}^{t} \le C_5 n^{-\rho}$ for some $\rho > 0$, then it follows that $P[|Z_n| > \varepsilon] = O(n^{-\rho})$. Identifying $C_{nj} = a_n^{-p} \int_{R_j} k[(x-u)/a_n]\,du$ and $U_j = Y_j - E Y_j$, $j = 1, \ldots, n$, we see that if $a_n \ge C_2 n^{-(1-\beta)/p}$ then $\max_{1 \le j \le n} C_{nj} \le C_4 n^{-\beta}$, and if $a_n \ge C_1 n^{-\alpha/p}$, then

$$\sum_{j=1}^{n} C_{nj} = a_n^{-p} \int_{[0,1]^p} k[(x-u)/a_n]\,du \le C^* a_n^{-p} \le C_3 n^{\alpha}.$$

Finally,

$$\sum_{j=1}^{n} C_{nj}^{t} \le \max_{1 \le j \le n} C_{nj}^{t-1} \sum_{j=1}^{n} C_{nj} \le C_5 n^{-\beta(t-1)+\alpha} = C_5 n^{-\rho},$$

where $\rho = \beta(t-1) - \alpha$. ||
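A small worked instance may help fix the roles of the exponents in Lemma 1. The numerical values below are chosen purely for illustration and do not come from the paper.

```latex
% Illustrative values (assumed): alpha = 1/4, beta = 1/2, t = 2, p = 1.
\alpha = \tfrac14,\quad \beta = \tfrac12,\quad t = 2
\;\Longrightarrow\;
t > 1 + \alpha/\beta = \tfrac32,
\qquad
\rho = \beta(t-1) - \alpha = \tfrac12 - \tfrac14 = \tfrac14 .
% With C_1 = C_2 = 1, the bandwidth a_n = n^{-1/5} satisfies (2.7), since
% n^{-1/5} \ge n^{-1/4} = n^{-\alpha/p} \ge n^{-1/2} = n^{-(1-\beta)/p}.
```

Under these assumed values the tail condition reads $u^2 G(u) \le M$, and (2.8) gives $P(|g_n(x) - E g_n(x)| > \varepsilon) = O(n^{-1/4})$.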


THEOREM 4. Assume that the conditions of Lemma 1 are in force. Then for any $\varepsilon > 0$ and all $x \in [0,1]^p$,

$$P[|g_n(x) - g(x)| > \varepsilon] = O(n^{-\rho}). \tag{2.9}$$

PROOF. Since $E g_n(x) - g(x) \to 0$ as $n \to \infty$ for all $x \in [0,1]^p$, we have for n sufficiently large that $|E g_n(x) - g(x)| < \varepsilon/2$ and hence

$$P[|g_n(x) - g(x)| > \varepsilon] \le P[|g_n(x) - E g_n(x)| > \varepsilon/2] = O(n^{-\rho}), \tag{2.10}$$

by Lemma 1. ||

We can also establish rates of convergence in the mean square consistency, i.e., the rate of $E[g_n(x) - g(x)]^2$. To this end we establish the following lemma.

LEMMA 2. Suppose that k(u) is such that $\int u_{i_1} \cdots u_{i_j}\, k(u)\,du = 0$ for all $i_1, \ldots, i_j = 1, 2, \ldots, p$ and $j = 1, \ldots, M-1$, and $\int |u_{i_1}| \cdots |u_{i_M}|\, k(u)\,du < \infty$ for all $i_1, \ldots, i_M = 1, 2, \ldots, p$. Assume also that all partial derivatives of order M or less of g(x) exist and are bounded. Then for any $x \in [0,1]^p$,

$$[E g_n(x) - g(x)]^2 = O(a_n^{2Mp}). \tag{2.11}$$

REMARK. Under the present setting Lemma 2 holds only for $M \le 2$. In order for the lemma to hold for $M > 2$, however, the kernel function can no longer be a probability density function. It must be allowed to take both positive and negative values. Then Conditions (i) and (ii) for $k(\cdot)$ given in Section 1 must be appropriately modified. This phenomenon is also noted by Cacoullos (1966) in the estimate of a multivariate density function.


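PROOF OF LEMMA 2. Recall that

$$|E g_n(x) - g(x)| \le \Big| \sum_{j=1}^{n} \int_{R_j} [g(x_j) - g(u)]\, a_n^{-p} k[(x-u)/a_n]\,du \Big| + \Big| \int [g(u) - g(x)]\, a_n^{-p} k[(x-u)/a_n]\,du \Big| = I_{1n} + I_{2n}, \text{ say.} \tag{2.12}$$

We shall show that $I_{in} = O(a_n^{Mp})$, $i = 1, 2$, $x \in [0,1]^p$. We shall prove that $I_{2n} = O(a_n^{Mp})$, and note that $I_{1n} = O(a_n^{Mp})$ can be shown analogously. Now, writing $x' = (x_1, \ldots, x_p)$,

$$I_{2n} = \Big| \int_{C_n} [g(x - a_n w) - g(x)]\, k(w)\,dw \Big|, \tag{2.13}$$

where

$$C_n = \Big[-\frac{1-x_1}{a_n}, \frac{x_1}{a_n}\Big] \times \cdots \times \Big[-\frac{1-x_p}{a_n}, \frac{x_p}{a_n}\Big].$$

Now, using multidimensional Taylor expansion we see that

$$g(x - a_n w) - g(x) = \sum_{j=1}^{M-1} \frac{1}{j!}\, g^{(j)}(x; -a_n w) + \frac{1}{M!}\, g^{(M)}(x - \theta a_n w; -a_n w), \tag{2.14}$$

where

$$g^{(j)}(x; t) = \sum_{i_1=1}^{p} \cdots \sum_{i_j=1}^{p} \frac{\partial^j g(x)}{\partial x_{i_1} \cdots \partial x_{i_j}}\, t_{i_1} \cdots t_{i_j}.$$

Hence

$$I_{2n} = \Big| \frac{1}{M!} \int k(w)\, g^{(M)}(x - \theta a_n w; -a_n w)\,dw \Big| \le \frac{a_n^{Mp}}{M!} \sum_{i_1=1}^{p} \cdots \sum_{i_M=1}^{p} \int k(w)\, \big| D(i_1, \ldots, i_M; g)(|x - \theta a_n w|) \big|\, |w_{i_1}| \cdots |w_{i_M}|\,dw, \tag{2.15}$$

where

$$D(i_1, \ldots, i_M; g)(x) = \frac{\partial^M g(x)}{\partial x_{i_1} \cdots \partial x_{i_M}}$$

and $|x - \theta a_n w| = (|x_1 - \theta a_n w_1|, \ldots, |x_p - \theta a_n w_p|)$. But $|D(i_1, \ldots, i_M; g)(x)|$ is bounded for all x; it follows that

$$I_{2n} \le C^* a_n^{Mp}. \tag{2.16}$$

Thus the lemma follows. ||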


THEOREM 5. Assume that the conditions of Lemma 2 are satisfied. Then for all $x \in [0,1]^p$ and $a_n = O(n^{-1/((2M+1)p)})$,

$$E[g_n(x) - g(x)]^2 = O(n^{-2M/(2M+1)}). \tag{2.17}$$

PROOF. Note that $E[g_n(x) - g(x)]^2 \le \mathrm{Var}\, g_n(x) + [E g_n(x) - g(x)]^2$. Now from Lemma 2, $[E g_n(x) - g(x)]^2 = O(n^{-2M/(2M+1)})$. Thus we need to evaluate $\mathrm{Var}\, g_n(x)$:

$$\mathrm{Var}\, g_n(x) = \sigma^2 a_n^{-2p} \sum_{j=1}^{n} \Big\{ \int_{R_j} k[(x-u)/a_n]\,du \Big\}^2 \le \sigma^2 a_n^{-2p} \max_{1 \le j \le n} A(R_j)\, \sup_v k(v) \int_{[0,1]^p} k[(x-u)/a_n]\,du \le C a_n^{-p} n^{-1} = O((n a_n^p)^{-1}) = O(n^{-2M/(2M+1)}), \tag{2.18}$$

by choosing $a_n = O(n^{-1/((2M+1)p)})$. ||
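The balancing of the two terms in the proof of Theorem 5 can be illustrated with a short calculation. The sketch below takes the bandwidth $a_n = n^{-1/((2M+1)p)}$ with the constant set to 1 (an assumption made only for this example) and evaluates the orders of the squared-bias bound $a_n^{2Mp}$ from (2.11) and the variance bound $(n a_n^p)^{-1}$ from (2.18). The function names are hypothetical.

```python
def bandwidth(n, M, p):
    """a_n = n^{-1/((2M+1)p)}: the order used in Theorem 5, constant set to 1."""
    return n ** (-1.0 / ((2 * M + 1) * p))

def mse_terms(n, M, p):
    """Orders of the squared-bias bound a_n^{2Mp} and variance bound 1/(n a_n^p)."""
    a = bandwidth(n, M, p)
    return a ** (2 * M * p), 1.0 / (n * a ** p)

if __name__ == "__main__":
    for n in (10**3, 10**6):
        bias2, var = mse_terms(n, M=2, p=3)
        # Both terms are of order n^{-2M/(2M+1)} = n^{-4/5} when M = 2.
        print(n, bias2, var, n ** (-4 / 5))
```

With this bandwidth the two bounds are of the same order, which is why neither term dominates in (2.17).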


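As for the asymptotic normality of $g_n(x)$ we proceed as follows: Choose the regions $R_j$ so that $A(R_j) = c_j/n$, $j = 1, \ldots, n$, where $c_1, \ldots, c_n$ are positive constants with $\sum_{j=1}^{n} c_j = n$, and that $\sup_{u,v \in R_j} \|u - v\| = O(n^{-1/p})$. Write $c_{\min} = \min_j \{c_j\}$ and $c_{\max} = \max_j \{c_j\}$. Assume that $\nu_3 = E|e_1|^3 < \infty$. Then, for large n,

$$\begin{aligned}
(n a_n^p)^{3/2} \sum_{j=1}^{n} E\Big| a_n^{-p} (Y_j - E Y_j) \int_{R_j} k[(x-u)/a_n]\,du \Big|^3
&= (n a_n^{-p})^{3/2} \sum_{j=1}^{n} E|Y_j - E Y_j|^3 \Big\{ \int_{R_j} k[(x-u)/a_n]\,du \Big\}^3 \\
&= \nu_3 (n a_n^{-p})^{3/2} \sum_{j=1}^{n} \Big\{ \int_{R_j} k[(x-u)/a_n]\,du \Big\}^3 \\
&\le C \nu_3 (n a_n^{-p})^{3/2} (c_{\max}/n)^2 \int_{[0,1]^p} k[(x-u)/a_n]\,du \\
&\le C' \nu_3 (n a_n^p)^{-1/2},
\end{aligned}$$

provided that $k(\cdot)$ is bounded. Similarly, for n sufficiently large and $k(\cdot)$ Lipschitz of order $\beta$,

$$\begin{aligned}
n a_n^p\, \mathrm{Var}\, g_n(x)
&= n a_n^{-p} \sum_{j=1}^{n} E(Y_j - E Y_j)^2 \Big\{ \int_{R_j} k[(x-u)/a_n]\,du \Big\}^2 \\
&= \sigma^2 (n a_n^{-p}) \sum_{j=1}^{n} \Big\{ \int_{R_j} k[(x-u)/a_n]\,du \Big\}^2 \\
&\simeq \sigma^2 (n a_n^{-p}) \sum_{j=1}^{n} A(R_j)\, k[(x-x_j)/a_n] \int_{R_j} k[(x-u)/a_n]\,du \\
&\simeq \sigma^2 \sum_{j=1}^{n} c_j\, a_n^{-p} \int_{R_j} k^2[(x-u)/a_n]\,du \\
&\ge \sigma^2 c_{\min} \int_{R^p} k^2(u)\,du.
\end{aligned} \tag{2.19}$$

In fact, when $A(R_j) = 1/n$ for all j, $(n a_n^p)\, \mathrm{Var}\, g_n(x) \to \sigma^2 \int k^2(u)\,du$. Hence applying Liapounov's central limit theorem (Loève (1963), p. 277), we see that

$$(n a_n^p)^{1/2}\, [g_n(x) - E g_n(x)] \Big/ \Big\{ \sigma^2 \sum_{j=1}^{n} c_j\, a_n^{-p} \int_{R_j} k^2[(x-u)/a_n]\,du \Big\}^{1/2}$$

converges to N(0,1) in distribution as n tends to infinity and that from Lemma 2 with M = 2

$$(n a_n^p)^{1/2}\, [E g_n(x) - g(x)] = O\big((n a_n^{5p})^{1/2}\big). \tag{2.20}$$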

Hence  we  arrive  at  the  following  theorem. 

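THEOREM 6. Assume that $E|e_1|^3 < \infty$ and that g(x) has bounded second partial derivatives. If k is $\mathrm{Lip}(\beta)$, $\int u_i\, k(u)\,du = 0$, and $\int |u_i u_j|\, k(u)\,du < \infty$, all $i, j = 1, \ldots, p$, if $n a_n^p \to \infty$ and if $n a_n^{5p} \to 0$ as $n \to \infty$, then

$$(n a_n^p)^{1/2}\, [g_n(x) - g(x)] \Big/ \Big[ \sigma^2 \sum_{j=1}^{n} c_j V_j^2 \Big]^{1/2} \xrightarrow{D} N(0,1), \quad \text{as } n \to \infty, \tag{2.21}$$

where

$$V_j^2 = a_n^{-p} \int_{R_j} k^2[(x-u)/a_n]\,du.$$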
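The variance limit $(n a_n^p)\,\mathrm{Var}\, g_n(x) \to \sigma^2 \int k^2(u)\,du$ that underlies Theorem 6 can be checked numerically in a simple case. The sketch below assumes p = 1, $A(R_j) = 1/n$ with $R_j = [(j-1)/n, j/n]$, and the kernel $k(u) = \tfrac{3}{4}(1-u^2)$ on $[-1,1]$, for which $\int k^2(u)\,du = 0.6$. These choices and the function names are assumptions made for the example.

```python
import numpy as np

def kernel_cdf(v):
    """CDF of the kernel k(u) = 0.75 * (1 - u**2) supported on [-1, 1]."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3

def scaled_variance(x, n, a, sigma2=1.0):
    """n * a * Var g_n(x) = sigma2 * n * a * sum_j C_nj**2 (p = 1, A(R_j) = 1/n)."""
    edges = np.linspace(0.0, 1.0, n + 1)
    w = kernel_cdf((x - edges[:-1]) / a) - kernel_cdf((x - edges[1:]) / a)
    return sigma2 * n * a * float(np.sum(w**2))

if __name__ == "__main__":
    for n in (200, 2000, 20000):
        print(n, scaled_variance(0.5, n, a=0.05))  # approaches 0.6
```

The printed values approach 0.6 as $n a_n$ grows, in line with the limit stated before Theorem 6 for an interior point x.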


Note that when p = 1, the estimate $g_n(x)$ is a competitor to that of Priestley and Chao (1972). The properties of our estimator hold under much weaker conditions than those of Priestley and Chao (1972) and Benedetti (1977). To make this remark more precise let us define a multidimensional extension of the Priestley and Chao estimate and discuss briefly its properties. Let

$$\tilde g_n(x) = a_n^{-p} \sum_{j=1}^{n} Y_j\, A(R_j)\, k[(x - x_j)/a_n]. \tag{2.22}$$

Note that if p = 1 and we select $R_j = [x_{j-1}, x_j]$, $j = 1, \ldots, n$, where $0 = x_0 < x_1 < \cdots < x_n = 1$, (2.22) reduces to the estimate proposed by Priestley and Chao (1972). They prove that if g(x) and k(u) are both Lipschitz of orders $\alpha$ and $\beta$ respectively, if $\max_{1 \le j \le n} A(R_j) = O(n^{-1})$, and if $a_n = n^{-\gamma}$, $0 < \gamma < \min(\alpha, \ldots)$, then $\tilde g_n(x) \xrightarrow{P} g(x)$ for all $x \in [0,1]$, provided that g(x) is continuous on [0,1]. A better result can be obtained for $\tilde g_n(x)$, $p \ge 1$, as follows: Consider

$$\begin{aligned}
E(\tilde g_n(x) - g_n(x))^2
&= E\Big\{ a_n^{-p} \sum_{j=1}^{n} Y_j \int_{R_j} \big[ k[(x - x_j)/a_n] - k[(x-u)/a_n] \big]\,du \Big\}^2 \\
&= \sigma^2 \sum_{j=1}^{n} \Big\{ a_n^{-p} \int_{R_j} \big[ k[(x - x_j)/a_n] - k[(x-u)/a_n] \big]\,du \Big\}^2 \\
&\quad + \Big\{ a_n^{-p} \sum_{j=1}^{n} g(x_j) \int_{R_j} \big[ k[(x - x_j)/a_n] - k[(x-u)/a_n] \big]\,du \Big\}^2.
\end{aligned} \tag{2.23}$$

It is not difficult to see that if k(u) is Lipschitz of order $\beta$, then the first term is of order $O(n^{-(2\beta+1)} a_n^{-(2\beta+2)p})$ and the second term is of order $O(n^{-2\beta} a_n^{-(2\beta+2)p})$, provided that $\max_{1 \le j \le n} A(R_j) = O(n^{-1})$. Thus if $n a_n^{(1+1/\beta)p} \to \infty$ as $n \to \infty$, we conclude, in view of Theorem 2, that: If the conditions of Theorem 1 hold, if k(u) is Lipschitz of order $\beta$ and if $n a_n^{(1+1/\beta)p} \to \infty$, then $\tilde g_n(x) \xrightarrow{P} g(x)$ as $n \to \infty$ for all $x \in [0,1]^p$.

Note that while the above result improves that of Priestley and Chao (1972), it is still weaker than that of Theorem 2, but the calculation of $\tilde g_n(x)$ is easier than the calculation of $g_n(x)$, since the latter requires evaluation of $\int_{R_j} k[(x-u)/a_n]\,du$, which may not be easy for some kernels such as a multivariate normal. We shall show in Section 4, however, that the optimum choice of k(u) is a p.d.f. such that the evaluation of $\int_{R_j} k[(x-u)/a_n]\,du$ is not difficult whenever $R_j$ is properly devised.

On the other hand, note that it follows from the above discussion that $(n a_n^p)\, E(\tilde g_n(x) - g_n(x))^2 = O(n^{-2\beta+1} a_n^{-(2\beta+1)p})$. Hence, if $n^{2\beta-1} a_n^{(2\beta+1)p} \to \infty$ as $n \to \infty$, if the conditions of Theorem 6 are satisfied, and if k(u) is Lipschitz of order $\beta$, then

$$(n a_n^p)^{1/2}\, (\tilde g_n(x) - g(x)) \Big/ \Big[ \sigma^2 \sum_{j=1}^{n} c_j V_j^2 \Big]^{1/2} \xrightarrow{D} N(0,1),$$

as $n \to \infty$; compare this result with Theorem 2 of Benedetti (1977).
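To make the comparison between the two estimates concrete, the sketch below computes both the integrated-kernel estimate $g_n(x)$ of (1.2) and the point-evaluation extension of the Priestley and Chao estimate in (2.22) for p = 1. It assumes the uniform partition $R_j = [(j-1)/n, j/n]$, design points at the cell midpoints, and the kernel $k(u) = \tfrac{3}{4}(1-u^2)$ on $[-1,1]$; the function names and the test signal are hypothetical.

```python
import numpy as np

def kernel(v):
    """k(u) = 0.75 * (1 - u**2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(v) <= 1.0, 0.75 * (1.0 - v**2), 0.0)

def kernel_cdf(v):
    """CDF of the kernel above."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3

def g_n(x, y, a):
    """Estimate (1.2): weights are exact integrals of the kernel over each cell."""
    edges = np.linspace(0.0, 1.0, len(y) + 1)
    w = kernel_cdf((x - edges[:-1]) / a) - kernel_cdf((x - edges[1:]) / a)
    return float(w @ y)

def g_tilde(x, y, a):
    """Estimate (2.22): a^{-1} * sum_j Y_j * A(R_j) * k((x - x_j)/a), A(R_j) = 1/n."""
    n = len(y)
    design = (np.arange(n) + 0.5) / n
    return float(np.sum(y * kernel((x - design) / a)) / (n * a))

if __name__ == "__main__":
    n, a = 1000, 0.1
    design = (np.arange(n) + 0.5) / n
    y = np.sin(2 * np.pi * design)
    print(g_n(0.3, y, a), g_tilde(0.3, y, a))
```

For a smooth kernel and a fine partition the two values are close; the point-evaluation form avoids the cell integrals at the cost of the extra Lipschitz condition on k discussed above.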

We close this section by noticing that the estimate $\tilde g_n(x)$ gives rise to a different estimate of $\sigma^2$, namely we can define

$$\tilde\sigma_n^2 = n^{-1} \sum_{j=1}^{n} \big(Y_j - \tilde g_n(x_j)\big)^2. \tag{2.24}$$

In Section 3, we shall demonstrate that $\tilde\sigma_n^2$ is also consistent but it requires stronger conditions than those needed for the consistency of $\sigma_n^2$.


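3.  CONSISTENCY OF $\sigma_n^2$.

In this section we show that $\sigma_n^2$ is weakly consistent. To this end note that we can write

$$n \sigma_n^2 = \sum_{j=1}^{n} (Y_j - g(x_j))^2 + \sum_{j=1}^{n} (g_n(x_j) - g(x_j))^2 + 2 \sum_{j=1}^{n} (Y_j - g(x_j))(g(x_j) - g_n(x_j)) = I_{1n} + I_{2n} + I_{3n}, \text{ say.} \tag{3.1}$$

Note that $n^{-1} I_{1n} \xrightarrow{P} \sigma^2$ as $n \to \infty$ by the weak law of large numbers. Thus we need to show $n^{-1} I_{in} \xrightarrow{P} 0$, $i = 2, 3$, as $n \to \infty$. A stronger conclusion would be to show that $n^{-1} E|I_{in}| \to 0$, $i = 2, 3$, as $n \to \infty$. But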


$$n^{-1} E I_{2n} = n^{-1} \Big\{ \sum_{j=1}^{n} \mathrm{Var}(g_n(x_j)) + \sum_{j=1}^{n} [E g_n(x_j) - g(x_j)]^2 \Big\}. \tag{3.2}$$

In view of (2.19), $n a_n^p\, \mathrm{Var}\, g_n(x) \simeq \sigma^2 \sum_{j=1}^{n} c_j V_j^2 = O(1)$, as $n \to \infty$. Hence the first term of (3.2) is readily seen to be of order $O((n a_n^p)^{-1}) = o(1)$ if $n a_n^p \to \infty$ as $n \to \infty$. Next, if the conditions of Lemma 2 hold with M = 2, then the second term of (3.2) is of order $O(a_n^{4p}) = o(1)$. As for $n^{-1} E|I_{3n}|$, we have

$$n^{-1} E|I_{3n}| \le 2 n^{-1} \Big\{ \sum_{j=1}^{n} E^{1/2}(Y_j - g(x_j))^2\, E^{1/2}(g_n(x_j) - g(x_j))^2 \Big\} \le O(n^{-1/2} a_n^{-p/2}) = o(1), \tag{3.3}$$

under the same conditions used in the proof of (3.2). Hence we arrive at


THEOREM 7. Assume that the conditions of Lemma 2 are satisfied with M = 2. Then

$$\sigma_n^2 \xrightarrow{P} \sigma^2 \quad \text{as } n \to \infty. \tag{3.4}$$


Note that we can derive an analogous result for $\tilde\sigma_n^2$ under a bit stronger conditions, viz., write

$$\tilde\sigma_n^2 = \sigma_n^2 + n^{-1} \sum_{j=1}^{n} (g_n(x_j) - \tilde g_n(x_j))^2 + 2 n^{-1} \sum_{j=1}^{n} (Y_j - g_n(x_j))(g_n(x_j) - \tilde g_n(x_j)).$$

But $\sigma_n^2 \xrightarrow{P} \sigma^2$ and $E(\tilde g_n(x) - g_n(x))^2 \to 0$ if k(u) is Lipschitz of order $\beta$ and $n a_n^{(1+1/\beta)p} \to \infty$. Thus we can easily see that the second and the third terms of the above expression of $\tilde\sigma_n^2$ have expected values converging to 0, as $n \to \infty$.


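It is possible also to obtain the second mean convergence of $\sigma_n^2$ (and thus of $\tilde\sigma_n^2$) under the extra assumption that $n a_n^{2p} \to \infty$ as $n \to \infty$ and $E e_1^4 < \infty$. To see this, we need to show that $\mathrm{Var}\, \sigma_n^2 \to 0$ as $n \to \infty$. Now,

$$\mathrm{Var}(\sigma_n^2) = n^{-2} \Big\{ \sum_{j=1}^{n} \mathrm{Var}[(Y_j - g_n(x_j))^2] + \sum_{j \ne j^*} \mathrm{Cov}[(Y_j - g_n(x_j))^2, (Y_{j^*} - g_n(x_{j^*}))^2] \Big\} = n^{-2} \{J_{1n} + J_{2n}\}, \text{ say.} \tag{3.5}$$

But

$$\begin{aligned}
\mathrm{Var}[(Y_j - g_n(x_j))^2] &= \mathrm{Var}(Y_j^2) + \mathrm{Var}(g_n^2(x_j)) + 4\,\mathrm{Var}(Y_j g_n(x_j)) + 2\,\mathrm{Cov}(Y_j^2, g_n^2(x_j)) \\
&\quad - 4\,\mathrm{Cov}(Y_j^2, Y_j g_n(x_j)) - 4\,\mathrm{Cov}(Y_j g_n(x_j), g_n^2(x_j)).
\end{aligned} \tag{3.6}$$

Whenever $\max_{1 \le j \le n} A(R_j) = O(n^{-1})$ and $n a_n^{2p} \to \infty$ as $n \to \infty$, it is not difficult to obtain that $\mathrm{Var}(g_n^2(x_j)) = O(n^{-2} a_n^{-4p})$, $\mathrm{Var}(Y_j g_n(x_j)) = O(n^{-1} a_n^{-2p})$, $\mathrm{Cov}(Y_j^2, g_n^2(x_j)) = O(n^{-1} a_n^{-2p})$, $\mathrm{Cov}(Y_j^2, Y_j g_n(x_j)) = O(n^{-1} a_n^{-p})$, and $\mathrm{Cov}(g_n^2(x_j), Y_j g_n(x_j)) = O(n^{-3/2} a_n^{-3p})$.

Hence it follows that $n^{-2} J_{1n} = n^{-2} \sum_{j=1}^{n} \mathrm{Var}[(Y_j - g_n(x_j))^2] \to 0$ as $n \to \infty$. Similar but perhaps more tedious algebra also reveals that $n^{-2} J_{2n} = n^{-2} \sum_{j \ne j^*} \mathrm{Cov}((Y_j - g_n(x_j))^2, (Y_{j^*} - g_n(x_{j^*}))^2) \to 0$ as $n \to \infty$, provided that $n a_n^{2p} \to \infty$.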

Finally note that one can combine Theorems 6 and 7 to conclude that an asymptotically normal confidence interval for g(x) can be constructed with the limits

$$g_n(x) \pm z_{\alpha/2} \Big[ \sigma_n^2 \sum_{j=1}^{n} c_j V_j^2 \Big/ (n a_n^p) \Big]^{1/2},$$

where $z_\alpha$ denotes the upper $100\alpha\%$ point of the standard normal distribution and $V_j^2$ is given in (2.21).
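The construction of the interval can be sketched in code for p = 1. The example assumes $A(R_j) = 1/n$ (so $c_j = 1$) with $R_j = [(j-1)/n, j/n]$, design points at the cell midpoints, the kernel $k(u) = \tfrac{3}{4}(1-u^2)$ on $[-1,1]$, and limits of the form $g_n(x) \pm z_{\alpha/2} [\sigma_n^2 \sum_j V_j^2 / (n a_n)]^{1/2}$ obtained by combining (2.21) with the variance estimate (1.3). All function names and the simulated data are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def kernel_cdf(v):
    """Integral of k(u) = 0.75 * (1 - u**2) from -1 to v."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3

def kernel_sq_cdf(v):
    """Integral of k(u)**2 from -1 to v; equals 0.6 at v = 1."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5625 * (v - 2.0 * v**3 / 3.0 + v**5 / 5.0) + 0.3

def weights(x, edges, a):
    """C_nj = a^{-1} * integral over R_j of k((x - u)/a) du, one row per x."""
    x = np.atleast_1d(x)[:, None]
    return kernel_cdf((x - edges[:-1]) / a) - kernel_cdf((x - edges[1:]) / a)

def confidence_interval(x, y, a, alpha=0.05):
    """Approximate (1 - alpha) interval for g(x): returns (lower, estimate, upper)."""
    n = len(y)
    edges = np.linspace(0.0, 1.0, n + 1)
    design = 0.5 * (edges[:-1] + edges[1:])
    estimate = float(weights(x, edges, a)[0] @ y)
    fitted = weights(design, edges, a) @ y
    sigma2 = float(np.mean((y - fitted) ** 2))            # estimate (1.3)
    # V_j^2 = a^{-1} * integral over R_j of k^2((x - u)/a) du, as in (2.21).
    v2 = kernel_sq_cdf((x - edges[:-1]) / a) - kernel_sq_cdf((x - edges[1:]) / a)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = float(z * np.sqrt(sigma2 * v2.sum() / (n * a)))
    return estimate - half, estimate, estimate + half

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 400
    design = (np.arange(n) + 0.5) / n
    y = np.sin(2 * np.pi * design) + 0.2 * rng.standard_normal(n)
    print(confidence_interval(0.5, y, a=0.1))
```

The interval is centred at $g_n(x)$, so it covers g(x) at the nominal level only when the bias is negligible, which is the role of the condition $n a_n^{5p} \to 0$ in Theorem 6.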

4.  OPTIMAL CHOICE OF THE KERNEL.

We now proceed to find the kernel k(u) which minimizes the mean square error $E(g_n(x) - g(x))^2$. Note that since the present regression estimation problem resembles the density estimation problem, it is not surprising that the optimal choice of the kernel function for our problem turns out to be exactly the same as that derived for the density estimation problem. For the latter case when $k(u) = \prod_{i=1}^{p} k(u_i)$, where $k(\cdot)$ is a bounded univariate p.d.f. such that $|u|\, k(u) \to 0$ as $|u| \to \infty$, see Epanechnikov (1969). Assume for the remainder of the study that $A(R_j) = 1/n$, $j = 1, \ldots, n$. Let $\hat k(t)$ denote the characteristic function of k(u), i.e.,

$$\hat k(t) = \int e^{i t'u}\, k(u)\,du. \tag{4.1}$$


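Then we can write

$$E g_n(x) - g(x) = (2\pi)^{-p} \Big\{ \sum_{j=1}^{n} g(x_j) \int_{R_j} \Big[ \int e^{i w'(x-u)}\, \hat k(a_n w)\,dw \Big] du \Big\} - (2\pi)^{-p} \int e^{i w'x}\, \phi_g(w)\,dw, \tag{4.2}$$

where $\phi_g(w) = \int_{[0,1]^p} e^{-i w'u}\, g(u)\,du$. Thus

$$\begin{aligned}
E g_n(x) - g(x)
&= (2\pi)^{-p} \Big\{ \int e^{i w'x} \Big[ \hat k(a_n w) \sum_{j=1}^{n} g(x_j) \int_{R_j} e^{-i w'u}\,du - \phi_g(w) \Big] dw \Big\} \\
&= (2\pi)^{-p} \int e^{i w'x}\, \hat k(a_n w) \Big[ \sum_{j=1}^{n} g(x_j) \int_{R_j} e^{-i w'u}\,du - \phi_g(w) \Big] dw \\
&\quad + (2\pi)^{-p} \int e^{i w'x}\, \phi_g(w)\, [\hat k(a_n w) - 1]\,dw.
\end{aligned} \tag{4.3}$$

Now, if g is bounded and continuous, then by the dominated convergence theorem,

$$\sum_{j=1}^{n} g(x_j) \int_{R_j} e^{-i w'u}\,du \to \phi_g(w) \quad \text{as } n \to \infty. \tag{4.4}$$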


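Thus the first term in the right-hand side (rhs) of (4.3) converges to 0 as $n \to \infty$. Next, if there exist positive $r_1, \ldots, r_p$ such that

$$[1 - \hat k(u)] \Big/ \prod_{i=1}^{p} |u_i|^{r_i} \to k_{r_1, \ldots, r_p},$$

a nonzero constant, as $\|u\| \to 0$, then $r_1, \ldots, r_p$ are called the characteristic exponents of $\hat k$ and $k_{r_1, \ldots, r_p}$ the characteristic coefficient. Thus

$$\begin{aligned}
a_n^{-\sum r_i} (2\pi)^{-p} \int e^{i w'x}\, \phi_g(w)\, [\hat k(a_n w) - 1]\,dw
&= -(2\pi)^{-p} \int e^{i w'x}\, \phi_g(w)\, \frac{1 - \hat k(a_n w)}{\prod_{i=1}^{p} |a_n w_i|^{r_i}} \Big( \prod_{i=1}^{p} |w_i|^{r_i} \Big) dw \\
&\to -k_{r_1, \ldots, r_p}\, (2\pi)^{-p} \int e^{i w'x} \prod_{i=1}^{p} |w_i|^{r_i}\, \phi_g(w)\,dw = -k_{r_1, \ldots, r_p}\, h^{r_1, \ldots, r_p}(x), \text{ say.}
\end{aligned} \tag{4.5}$$

Thus, since $\mathrm{Var}\, g_n(x) \sim \sigma^2 (n a_n^p)^{-1} \int k^2(u)\,du$ for n sufficiently large, we obtain that

$$E(g_n(x) - g(x))^2 \sim \sigma^2 (n a_n^p)^{-1} \int k^2(u)\,du + a_n^{2 \sum r_i}\, \big| k_{r_1, \ldots, r_p}\, h^{r_1, \ldots, r_p}(x) \big|^2, \tag{4.6}$$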


where $\sum r_i = \sum_{i=1}^{p} r_i$. Thus the rhs of (4.6) is minimized by choosing $a_n$ as follows:

$$a_n = \Big\{ (p \sigma^2 / n) \int k^2(u)\,du \Big/ \Big[ 2 \big(\textstyle\sum r_i\big)\, \big| k_{r_1, \ldots, r_p}\, h^{r_1, \ldots, r_p}(x) \big|^2 \Big] \Big\}^{1/(p + 2\sum r_i)}, \tag{4.7}$$

and the minimum value is

$$\frac{(p + 2\sum r_i)\, \big\{ \sigma^2 \int k^2(u)\,du \big\}^{2\sum r_i/(p + 2\sum r_i)}}{p^{\,p/(p + 2\sum r_i)}\, \big[ (2 \sum r_i)\, n \big]^{2\sum r_i/(p + 2\sum r_i)}} \times \big| k_{r_1, \ldots, r_p}\, h^{r_1, \ldots, r_p}(x) \big|^{2p/(p + 2\sum r_i)}, \tag{4.8}$$

which tends to 0 at the rate $n^{-2\sum r_i/(p + 2\sum r_i)}$. Now suppose that among the special class of kernels $k(u) = \prod_{i=1}^{p} k(u_i)$, with $k(\cdot)$ a known bounded p.d.f., with $r_i = 2$, $i = 1, \ldots, p$, we want to find the one that minimizes (4.8). This problem is precisely that of Epanechnikov (1969), whose solution is found to be $k_0(u) = (3/4)(1 - u^2)$, $|u| \le 1$, $= 0$ elsewhere.
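The minimization leading to (4.7) and (4.8) can be verified numerically. In the sketch below the approximate mean square error (4.6) is written as $V/(n a^p) + a^{2R} B$, where $V = \sigma^2 \int k^2(u)\,du$, $B = |k_{r_1,\ldots,r_p} h^{r_1,\ldots,r_p}(x)|^2$ and $R = \sum r_i$. The names `var_const`, `bias_const` and `R`, and the numerical values used, are placeholders introduced for this example, not quantities taken from the paper.

```python
import numpy as np

def mse_approx(a, n, p, R, var_const, bias_const):
    """Right-hand side of (4.6): var_const / (n a^p) + a^(2R) * bias_const."""
    return var_const / (n * a**p) + a ** (2 * R) * bias_const

def optimal_bandwidth(n, p, R, var_const, bias_const):
    """Minimizer (4.7), obtained by setting the derivative of (4.6) in a to zero."""
    return (p * var_const / (2 * R * n * bias_const)) ** (1.0 / (p + 2 * R))

def minimum_value(n, p, R, var_const, bias_const):
    """Closed form (4.8) for the minimum of (4.6)."""
    q = p + 2 * R
    return (q * p ** (-p / q) * (2 * R * n) ** (-2 * R / q)
            * var_const ** (2 * R / q) * bias_const ** (p / q))

if __name__ == "__main__":
    n, p, R = 1000, 2, 4          # R = sum of r_i with r_i = 2 and p = 2
    V, B = 0.5, 3.0               # placeholder constants
    a_star = optimal_bandwidth(n, p, R, V, B)
    grid = np.linspace(0.5 * a_star, 2.0 * a_star, 2001)
    a_grid = grid[np.argmin(mse_approx(grid, n, p, R, V, B))]
    print(a_star, a_grid)
    print(mse_approx(a_star, n, p, R, V, B), minimum_value(n, p, R, V, B))
```

The grid minimizer agrees with the closed-form bandwidth to within the grid spacing, and the value of (4.6) at that bandwidth matches (4.8).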